Universal nucleotides for nucleic acid analysis

ABSTRACT

The invention includes methods and kits for making and analyzing primer extension products incorporating one or more universal bases, including methods and kits for nucleic acid sequencing and microsatellite analysis.

This is a divisional of U.S. patent application Ser. No. 10/290,672, filed Nov. 7, 2002, which claims the benefit of U.S. Provisional Application No. 60/336,966, filed Nov. 7, 2001, all of which are incorporated herein by reference for any purpose.

FIELD OF THE INVENTION

The present invention generally relates to universal nucleotides that can be incorporated into a polynucleotide strand during nucleic acid synthesis.

BACKGROUND OF THE INVENTION

Primer extension reactions are widely used in modern molecular biology. For example, in Sanger sequencing, an oligonucleotide primer is annealed to a 5′ end of a template, and deoxyribonucleotide triphosphates (dNTPs), polymerase, and four dideoxynucleotide terminators are added to form a reaction composition (the four terminators are either added to separate reactions or together in one reaction), and the reaction composition is incubated under appropriate conditions to achieve primer extension and termination.

Analysis of microsatellites, including Variable Number of Tandem Repeats (VNTRs) and Short Tandem Repeats (STRs), is another widely used method employing a primer extension reaction. STRs are sequences of two to seven nucleotides that are tandemly repeated at one or more locations in the genome. The number of tandem repeats varies from individual to individual. For certain genetic analysis techniques, STRs are amplified by PCR using specific primers flanking the repeat region and the number of repeats is determined. In certain techniques, the determination is made using size differentiation, e.g., by electrophoresis, mass spectroscopy, or chromatography.
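The size-based determination described above reduces to simple arithmetic once the flanking length is known. The following is a minimal sketch, assuming an idealized amplicon made up only of flanking sequence (primers included) plus a whole number of repeat units; the function name and the example sizes are hypothetical and are not taken from this application.

```python
def str_repeat_count(amplicon_len: int, flank_len: int, unit_len: int) -> int:
    """Infer the number of tandem repeats from an amplicon size.

    amplicon_len: total measured length of the PCR product, in nucleotides
    flank_len:    combined length of both flanking regions, primers included
    unit_len:     length of one repeat unit (two to seven nucleotides for an STR)
    """
    if not 2 <= unit_len <= 7:
        raise ValueError("STR repeat units are two to seven nucleotides long")
    repeat_region = amplicon_len - flank_len
    if repeat_region < 0 or repeat_region % unit_len != 0:
        raise ValueError("size is not consistent with a whole number of repeats")
    return repeat_region // unit_len


# Hypothetical example: a 160-nt product with 120 nt of flanking sequence
# and a tetranucleotide repeat unit.
print(str_repeat_count(160, 120, 4))
```

Measured sizes from electrophoresis or chromatography carry some error, so a practical caller would likely round the measured size to the nearest allele before applying this calculation.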

SUMMARY OF THE INVENTION

In certain embodiments, a method of sequencing at least one target nucleic acid template is provided. Certain such embodiments involve forming a reaction composition comprising at least one target nucleic acid template, at least one primer, at least one polymerase, at least one universal nucleotide, and at least one specific terminator that comprises a label. The reaction composition is incubated to generate at least one primer extension product comprising the at least one universal nucleotide and the at least one specific terminator. One or more of the at least one primer extension products is separated using at least one mobility-dependent analysis technique (MDAT). One or more of the at least one primer extension product is then detected.
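The sequencing method summarized above can be pictured with a small simulation. This is only an illustrative sketch under stated assumptions: a single universal nucleotide, written "X", is assumed to pair with all four template bases; exactly one terminated product is generated per template position; and the template bases are supplied in the order the polymerase encounters them. The function names and the example sequences are hypothetical.

```python
# Specific terminator (named by its base) that incorporates opposite each
# template base, e.g. a T terminator such as ddTTP incorporates opposite A.
TERMINATOR_FOR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def termination_products(primer: str, template_bases: str, universal: str = "X"):
    """Return (product, label) pairs, one per template position.

    Each product is the primer, a run of universal nucleotides, and a single
    labeled specific terminator; the label is named after the terminator base.
    """
    products = []
    for position, base in enumerate(template_bases):
        terminator = TERMINATOR_FOR[base]
        products.append((primer + universal * position + terminator, terminator))
    return products


def read_sequence(products) -> str:
    """Order products by length (standing in for an MDAT) and read the labels."""
    ordered = sorted(products, key=lambda item: len(item[0]))
    return "".join(label for _, label in ordered)


products = termination_products("ACGT", "GATTC")
print(read_sequence(products))
```

In this toy model every product differs from the next only by one universal nucleotide, so the order of labels by length gives the complement of the template bases that were copied.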

In certain embodiments, a method of sequencing at least one target nucleic acid template comprises forming at least one reaction composition that comprises at least one target nucleic acid template, at least one primer comprising a label, at least one polymerase, at least one universal nucleotide, and at least one specific terminator. The reaction composition is incubated to generate at least one primer extension product comprising one or more of the at least one universal nucleotides and one or more of the at least one specific terminators. One or more of the at least one primer extension product is separated using at least one mobility-dependent analysis technique (MDAT). One or more of the at least one primer extension products is then detected.

In certain embodiments, the method of sequencing at least one target nucleic acid template further comprises releasing the at least one primer extension product from the at least one target nucleic acid template, and repeating the incubating and releasing at least one additional time.

Certain embodiments of the invention provide a method for detecting a plurality of primer extension products. According to certain embodiments, the method includes forming a reaction composition comprising at least one template, at least one primer comprising a label, at least one polymerase, and at least one universal nucleotide. The reaction composition is incubated under appropriate conditions to generate at least one primer extension product comprising one or more of the at least one universal nucleotides. The primer extension product is released from the at least one target nucleic acid template, and the incubation and releasing procedures are repeated to produce a plurality of primer extension products. One or more of the plurality of primer extension products are separated using an MDAT. One or more of the plurality of primer extension products are then detected.

According to certain embodiments, a method for detecting a plurality of second primer extension products is provided. In certain embodiments, the method comprises forming a first reaction composition comprising at least one target nucleic acid template, at least one first primer, at least one specific nucleotide, and at least one first polymerase, wherein the first reaction composition does not include universal nucleotides. The first reaction composition is incubated under appropriate conditions to generate at least one first primer extension product. The at least one first primer extension product is released from the at least one target nucleic acid template, and the incubation and releasing procedures are repeated to produce a plurality of first primer extension products. In certain embodiments, the method further comprises forming a second reaction composition comprising one or more of the at least one first primer extension product, at least one second primer comprising a label, at least one second polymerase, and at least one universal nucleotide. The second reaction composition is incubated under appropriate conditions to generate at least one second primer extension product. The at least one second primer extension product is released from the at least one first primer extension product, and the incubation and releasing procedures are repeated to produce a plurality of second primer extension products. In certain embodiments, the method includes separating one or more of the plurality of second primer extension products using an MDAT. One or more of the plurality of second primer extension products are then detected.

According to certain embodiments, a kit is provided for sequencing a target nucleic acid template. In certain embodiments, the kit comprises at least one polymerase, at least one universal nucleotide, and at least one specific terminator.

In certain embodiments, a kit is provided for detecting a short tandem repeat in a target nucleic acid template. In certain embodiments, the kit comprises at least one universal nucleotide, at least one polymerase, and at least one primer comprising a sequence that is complementary to a sequence adjacent to a short tandem repeat in the target nucleic acid template.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1: FIG. 1A schematically illustrates primer extension proceeding in the presence of natural nucleotides, e.g., A (dATP), T (dTTP), G (dGTP), and C (dCTP), with a tetranucleotide primer (hatched bar). Each nucleotide adds to the primer extension product (SEQ ID NO: 2) in a template sequence-dependent manner, e.g., based on Watson-Crick base pairing wherein A pairs with T, and G pairs with C. In certain exemplary embodiments, shown in FIG. 1B, primer extension in the presence of universal nucleotide “X” results in the incorporation of the universal nucleotide into the primer extension product (SEQ ID NO: 3) in a template sequence-independent manner. Thus, universal nucleotide “X” can pair with any nucleotide present in the template. In certain exemplary embodiments, shown in FIG. 1C, primer extension in the presence of hypothetical universal nucleotide “Z” results in the incorporation of the universal nucleotide into the primer extension product (SEQ ID NO: 4) opposite certain nucleotides in the template, but not opposite others. Here, for example, Z pairs with G and C nucleotides in the template, but Z does not pair with A or T. Thus, Z is a universal nucleotide with respect to G and C, but not with respect to A or T. In certain exemplary embodiments, shown in FIG. 1D, primer extension in the presence of hypothetical universal nucleotide “Y” results in the incorporation of the universal nucleotide into the primer extension product (SEQ ID NO: 5) opposite A, T, and C in the template, but not opposite G.

FIG. 2: FIG. 2 shows the structures of several non-limiting exemplary universal nucleotides. (A) 2′-deoxy-7-azaindole-5′-triphosphate (d7AITP), (B) 2′-deoxy-6-methyl-7-azaindole-5′-triphosphate (dM7AITP), (C) 2′-deoxy-pyrrollpyrizine-5′-triphosphate (dPPTP), (D) 2′-deoxy-imidizopyridine-5′-triphosphate (dImPyTP), (E) 2′-deoxy-isocarbostyril-5′-triphosphate (dICSTP), (F) 2′-deoxy-propynyl-7-azaindole-5′-triphosphate (dP7AITP), (G) 2′-deoxy-propynylisocarbostyril-5′-triphosphate (dPICSTP), and (H) 2′-deoxy-allenyl-7-azaindole-5′-triphosphate (dA7AITP). “R”, as used in this figure, is the deoxyribose moiety of the nucleotide.

DETAILED DESCRIPTION OF EXEMPLARY EMBODIMENTS

It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the invention. In this application, the use of the singular includes the plural unless specifically stated otherwise. In this application, the use of “or” means “and/or” unless stated otherwise. Furthermore, the use of the term “including”, as well as other forms, such as “includes” and “included”, is not limiting.

The section headings used herein are for organizational purposes only and are not to be construed as limiting the subject matter described. All documents, or portions of documents, cited in this application, including but not limited to patents, patent applications, articles, books, and treatises, are hereby expressly incorporated by reference in their entirety for any purpose.

Definitions

The term “nucleotide base”, as used herein, refers to a substituted or unsubstituted aromatic ring or rings. In certain embodiments, the aromatic ring or rings contain at least one nitrogen atom. In certain embodiments, the nucleotide base is capable of forming Watson-Crick and/or Hoogsteen hydrogen bonds with an appropriately complementary nucleotide base. Exemplary nucleotide bases and analogs thereof include, but are not limited to, naturally occurring nucleotide bases adenine, guanine, cytosine, uracil, thymine, and analogs of the naturally occurring nucleotide bases, e.g., 7-deazaadenine, 7-deazaguanine, 7-deaza-8-azaguanine, 7-deaza-8-azaadenine, N6-Δ2-isopentenyladenine (6iA), N6-Δ2-isopentenyl-2-methylthioadenine (2ms6iA), N2-dimethylguanine (dmG), 7-methylguanine (7mG), inosine, nebularine, 2-aminopurine, 2-amino-6-chloropurine, 2,6-diaminopurine, hypoxanthine, pseudouridine, pseudocytosine, pseudoisocytosine, 5-propynylcytosine, isocytosine, isoguanine, 7-deazaguanine, 2-thiopyrimidine, 6-thioguanine, 4-thiothymine, 4-thiouracil, O⁶-methylguanine, N⁶-methyladenine, O⁴-methylthymine, 5,6-dihydrothymine, 5,6-dihydrouracil, pyrazolo[3,4-D]pyrimidines (see, e.g., U.S. Pat. Nos. 6,143,877 and 6,127,121 and PCT published application WO 01/38584), ethenoadenine, indoles such as nitroindole and 4-methylindole, and pyrroles such as nitropyrrole. In certain embodiments, nucleotide bases are universal nucleotide bases. Certain exemplary nucleotide bases can be found, e.g., in Fasman, 1989, Practical Handbook of Biochemistry and Molecular Biology, pp. 385-394, CRC Press, Boca Raton, Fla., and the references cited therein.

The term “nucleotide”, as used herein, refers to a compound comprising a nucleotide base linked to the C-1′ carbon of a sugar, such as ribose, arabinose, xylose, and pyranose, and sugar analogs thereof. The term nucleotide also encompasses nucleotide analogs. The sugar may be substituted or unsubstituted. Substituted ribose sugars include, but are not limited to, those riboses in which one or more of the carbon atoms, for example the 2′-carbon atom, is substituted with one or more of the same or different Cl, F, —R, —OR, —NR₂ or halogen groups, where each R is independently H, C₁-C₆ alkyl or C₅-C₁₄ aryl. Exemplary riboses include, but are not limited to, 2′-(C1-C6)alkoxyribose, 2′-(C5-C14)aryloxyribose, 2′,3′-didehydroribose, 2′-deoxy-3′-haloribose, 2′-deoxy-3′-fluororibose, 2′-deoxy-3′-chlororibose, 2′-deoxy-3′-aminoribose, 2′-deoxy-3′-(C1-C6)alkylribose, 2′-deoxy-3′-(C1-C6)alkoxyribose and 2′-deoxy-3′-(C5-C14)aryloxyribose, ribose, 2′-deoxyribose, 2′,3′-dideoxyribose, 2′-haloribose, 2′-fluororibose, 2′-chlororibose, and 2′-alkylribose, e.g., 2′-O-methyl, 4′-α-anomeric nucleotides, 1′-α-anomeric nucleotides, 2′-4′- and 3′-4′-linked and other “locked” or “LNA”, bicyclic sugar modifications (see, e.g., PCT published application nos. WO 98/22489, WO 98/39352, and WO 99/14226). Exemplary LNA sugar analogs within a polynucleotide include, but are not limited to, the structures:

where B is any nucleotide base.

Modifications at the 2′- or 3′-position of ribose include, but are not limited to, hydrogen, hydroxy, methoxy, ethoxy, allyloxy, isopropoxy, butoxy, isobutoxy, methoxyethyl, alkoxy, phenoxy, azido, amino, alkylamino, fluoro, chloro and bromo. Nucleotides include, but are not limited to, the natural D optical isomer, as well as the L optical isomer forms (see, e.g., Garbesi (1993) Nucl. Acids Res. 21:4159-65; Fujimori (1990) J. Amer. Chem. Soc. 112:7435; Urata, (1993) Nucleic Acids Symposium Ser. No. 29:69-70). When the nucleotide base is purine, e.g. A or G, the ribose sugar is attached to the N⁹-position of the nucleotide base. When the nucleotide base is pyrimidine, e.g. C, T or U, the pentose sugar is attached to the N¹-position of the nucleotide base, except for pseudouridines, in which the pentose sugar is attached to the C5 position of the uracil nucleotide base (see, e.g., Kornberg and Baker, (1992) DNA Replication, 2nd Ed., Freeman, San Francisco, Calif.).

One or more of the pentose carbons of a nucleotide may be substituted with a phosphate ester having the formula:

where α is an integer from 0 to 4. In certain embodiments, α is 2 and the phosphate ester is attached to the 3′- or 5′-carbon of the pentose. In certain embodiments, the nucleotides are those in which the nucleotide base is a purine, a 7-deazapurine, a pyrimidine, a universal nucleotide base, a specific nucleotide base, or an analog thereof. “Nucleotide 5′-triphosphate” refers to a nucleotide with a triphosphate ester group at the 5′ position, and is sometimes denoted as “NTP”, or “dNTP” and “ddNTP” to particularly point out the structural features of the ribose sugar. The triphosphate ester group may include sulfur substitutions for the various oxygens, e.g. α-thio-nucleotide 5′-triphosphates. For a review of nucleotide chemistry, see, e.g., Shabarova, Z. and Bogdanov, A. Advanced Organic Chemistry of Nucleic Acids, VCH, New York, 1994.

The term “nucleotide analog”, as used herein, refers to embodiments in which the pentose sugar and/or the nucleotide base and/or one or more of the phosphate esters of a nucleotide may be replaced with its respective analog. In certain embodiments, exemplary pentose sugar analogs are those described above. In certain embodiments, the nucleotide analogs have a nucleotide base analog as described above. In certain embodiments, exemplary phosphate ester analogs include, but are not limited to, alkylphosphonates, methylphosphonates, phosphoramidates, phosphotriesters, phosphorothioates, phosphorodithioates, phosphoroselenoates, phosphorodiselenoates, phosphoroanilothioates, phosphoroanilidates, phosphoroamidates, boronophosphates, etc., and may include associated counterions.

Also included within the definition of “nucleotide analog” are nucleotide analog monomers which can be polymerized into polynucleotide analogs in which the DNA/RNA phosphate ester and/or sugar phosphate ester backbone is replaced with a different type of internucleotide linkage. Exemplary polynucleotide analogs include, but are not limited to, peptide nucleic acids, in which the sugar phosphate backbone of the polynucleotide is replaced by a peptide backbone.

An “extendable nucleotide” is a nucleotide which is: (i) capable of being enzymatically or synthetically incorporated onto the terminus of a polynucleotide chain, and (ii) capable of supporting further enzymatic or synthetic extension. Extendable nucleotides include nucleotides that have already been enzymatically or synthetically incorporated into a polynucleotide chain, and have either supported further enzymatic or synthetic extension, or are capable of supporting further enzymatic or synthetic extension. Extendable nucleotides include, but are not limited to, nucleotide 5′-triphosphates, e.g., dNTP and NTP, phosphoramidites suitable for chemical synthesis of polynucleotides, and nucleotide units in a polynucleotide chain that have already been incorporated enzymatically or chemically. Extendable nucleotides include, but are not limited to, specific nucleotides, nucleotide analogs, and universal nucleotides.

The term “type” as used herein with respect to nucleotides, refers to a structurally distinct nucleotide. For example, A and G are different types of nucleotides. Similarly, G and I are different types of nucleotides, although they both may pair with C. Nucleotides that are different “pairing types”, as used herein, are both structurally distinct and have different pairing specificities. For example, A and G are different pairing types of nucleotides, because they are structurally distinct, and they have different pairing specificities (T versus C). G and I are not different pairing types of nucleotides when compared to one another, because, although they are structurally distinct, they both pair specifically with C. The pairing type of a universal nucleotide, as defined below, is determined by the combination of nucleotides with which it pairs. For example, hypothetical universal nucleotide Y pairs with A, C, and T, and hypothetical universal nucleotide X, which is structurally distinct from Y, pairs with C and G. X and Y are therefore different types of universal nucleotides, and X and Y are also different pairing types of universal nucleotides, although they both may pair with C.

The term “specific nucleotide”, as used herein, refers to an extendable nucleotide that can be incorporated into a polynucleotide strand by a polymerase during a primer extension reaction, and will not pair with more than one different pairing type of nucleotide in a template strand, where the nucleotide in the template strand is not a universal nucleotide. For example, C is a specific nucleotide, even though it pairs specifically with both G and I, since G and I are not different pairing types of nucleotides. Similarly, C is a specific nucleotide, even though it pairs with both G and a universal nucleotide. Specific nucleotides may be naturally-occurring nucleotides, e.g., adenine, cytosine, guanine, thymine or uracil. Specific nucleotides may also be nucleotide analogs that pair in a template sequence-specific manner.

The term “universal nucleotide”, as used herein, refers to an extendable nucleotide that can be incorporated into a polynucleotide strand by a polymerase during a primer extension reaction, and pairs with more than one pairing type of specific nucleotide. In certain embodiments, the universal nucleotide pairs with any specific nucleotide. In certain embodiments, the universal nucleotide pairs with four pairing types of specific nucleotides or analogs thereof. In certain embodiments, the universal nucleotide pairs with three pairing types of specific nucleotides or analogs thereof. In certain embodiments, the universal nucleotide pairs with two pairing types of specific nucleotides or analogs thereof. The pairing of a universal nucleotide with two or more pairing types of specific nucleotides will be referred to as non-template sequence-specific pairing.
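The distinction between specific and universal nucleotides drawn in the definitions above comes down to counting pairing types among a nucleotide's partners. The sketch below is one hedged way to express that rule; the dictionary of pairing types, the function name, and the partner sets are illustrative only, with the hypothetical nucleotides X and Y taken from the definition of “pairing types” (X pairing with C and G, Y pairing with A, C, and T).

```python
# Pairing type of each specific nucleotide, keyed by what it pairs with.
# G and I are structurally distinct but share one pairing type (both pair with C).
PAIRING_TYPE = {
    "A": "pairs-with-T",
    "T": "pairs-with-A",
    "C": "pairs-with-G",
    "G": "pairs-with-C",
    "I": "pairs-with-C",
}


def classify(partners: set[str]) -> str:
    """Classify a nucleotide from the specific nucleotides it pairs with."""
    pairing_types = {PAIRING_TYPE[partner] for partner in partners}
    return "universal" if len(pairing_types) > 1 else "specific"


print(classify({"G", "I"}))       # partners of C
print(classify({"C", "G"}))       # partners of hypothetical X
print(classify({"A", "C", "T"}))  # partners of hypothetical Y
```

Under this rule C remains a specific nucleotide even though it pairs with two structurally distinct nucleotides, because G and I count as a single pairing type.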

The terms “universal nucleotide base” and “universal base” are used interchangeably and, as used herein, refer to the base portion of a universal nucleotide. The universal nucleotide base may include an aromatic ring moiety, which may or may not contain nitrogen atoms. In certain embodiments, a universal base may be covalently attached to the C-1′ carbon of a pentose sugar to make a universal nucleotide. In certain embodiments, a universal nucleotide base does not hydrogen bond specifically with another nucleotide base. In certain embodiments, a universal nucleotide base may interact with adjacent nucleotide bases on the same nucleic acid strand by hydrophobic stacking. Universal nucleotides include, but are not limited to, 2′-deoxy-7-azaindole-5′-triphosphate (d7AITP), 2′-deoxy-isocarbostyril-5′-triphosphate (dICSTP), 2′-deoxy-propynylisocarbostyril-5′-triphosphate (dPICSTP), 2′-deoxy-6-methyl-7-azaindole-5′-triphosphate (dM7AITP), 2′-deoxy-imidizopyridine-5′-triphosphate (dImPyTP), 2′-deoxy-pyrrollpyrizine-5′-triphosphate (dPPTP), 2′-deoxy-propynyl-7-azaindole-5′-triphosphate (dP7AITP), or 2′-deoxy-allenyl-7-azaindole-5′-triphosphate (dA7AITP).

The term “nucleotide terminator” or “terminator”, as used herein, refers to an enzymatically-incorporable nucleotide, which does not support incorporation of subsequent nucleotides in a primer extension reaction. A terminator is therefore not an extendable nucleotide. In certain embodiments, terminators are those in which the nucleotide is a purine, a 7-deaza-purine, a pyrimidine, a specific nucleotide or nucleotide analog and the sugar moiety is a pentose which includes a 3′-substituent that blocks further synthesis, such as a dideoxynucleotide triphosphate (ddNTP). In certain embodiments, substituents that block further synthesis include, but are not limited to, amino, deoxy, halogen, alkoxy and aryloxy groups. Exemplary terminators include, but are not limited to, those in which the sugar-phosphate ester moiety is 3′-(C1-C6)alkylribose-5′-triphosphate, 2′-deoxy-3′-(C1-C6)alkylribose-5′-triphosphate, 2′-deoxy-3′-(C1-C6)alkoxyribose-5′-triphosphate, 2′-deoxy-3′-(C5-C14)aryloxyribose-5′-triphosphate, 2′-deoxy-3′-haloribose-5′-triphosphate, 2′-deoxy-3′-aminoribose-5′-triphosphate, 2′,3′-dideoxyribose-5′-triphosphate or 2′,3′-didehydroribose-5′-triphosphate. In certain embodiments, ddNTPs, such as ddATP, ddCTP, ddGTP, ddITP, and ddTTP, may be used for chain termination.

In certain embodiments, a terminator is a “specific terminator”, which is incorporated by polymerase into a primer extension product opposite a particular nucleotide in the template. Specific terminators include, but are not limited to, T terminators, including ddTTP, which incorporate opposite an adenine, or adenine analog, in a template; A terminators, including ddATP, which incorporate opposite a thymine, uracil, or an analog of thymine or uracil, in the template; C terminators, including ddCTP, which incorporate opposite a guanine, or guanine analog, in the template; and G terminators, including ddGTP, which incorporate opposite a cytosine, or cytosine analog, in the template.

The term “label” refers to any moiety which can be attached to a molecule and: (i) provides a detectable signal; (ii) interacts with a second label to modify the detectable signal provided by the second label, e.g. FRET (Fluorescent Resonance Energy Transfer); (iii) stabilizes hybridization, e.g., duplex formation; or (iv) provides a member of a binding complex or affinity set, e.g., affinity, antibody/antigen, ionic complexation, hapten/ligand, e.g. biotin/avidin. Labeling can be accomplished using any one of a large number of known techniques employing known labels, linkages, linking groups, reagents, reaction conditions, and analysis and purification methods. Labels include, but are not limited to, light-emitting or light-absorbing compounds which generate or quench a detectable fluorescent, chemiluminescent, or bioluminescent signal (see, e.g., Kricka, L. in Nonisotopic DNA Probe Techniques (1992), Academic Press, San Diego, pp. 3-28). Fluorescent reporter dyes useful for labelling biomolecules include, but are not limited to, fluoresceins (see, e.g., U.S. Pat. Nos. 5,188,934; 6,008,379; and 6,020,481), rhodamines (see, e.g., U.S. Pat. Nos. 5,366,860; 5,847,162; 5,936,087; 6,051,719; and 6,191,278), benzophenoxazines (see, e.g., U.S. Pat. No. 6,140,500), energy-transfer fluorescent dyes, comprising pairs of donors and acceptors (see, e.g., U.S. Pat. Nos. 5,863,727; 5,800,996; and 5,945,526), and cyanines (see, e.g., Kubista, WO 97/45539), as well as any other fluorescent label capable of generating a detectable signal. Examples of fluorescein dyes include, but are not limited to, 6-carboxyfluorescein; 2′,4′,1,4-tetrachlorofluorescein; and 2′,4′,5′,7′,1,4-hexachlorofluorescein. Labels also include, but are not limited to, semiconductor nanocrystals, or quantum dots (see, e.g., U.S. Pat. Nos. 5,990,479 and 6,207,392 B1; Han et al. Nature Biotech. 19:631-635).

A class of labels are hybridization-stabilizing moieties which serve to enhance, stabilize, or influence hybridization of duplexes, e.g. intercalators, minor-groove binders, and cross-linking functional groups (see, e.g., Blackburn, G. and Gait, M. Eds. “DNA and RNA structure” in Nucleic Acids in Chemistry and Biology, 2nd Edition, (1996) Oxford University Press, pp. 15-81). Yet another class of labels effects the separation or immobilization of a molecule by specific or non-specific capture, for example biotin, digoxigenin, and other haptens (see, e.g., Andrus, A. “Chemical methods for 5′ non-isotopic labeling of PCR probes and primers” (1995) in PCR 2: A Practical Approach, Oxford University Press, Oxford, pp. 39-54). Non-radioactive labelling methods, techniques, and reagents are reviewed in: Non-Radioactive Labelling, A Practical Introduction, Garman, A. J. (1997) Academic Press, San Diego.

Labels may be “detectably different”, which means that they are distinguishable from one another by at least one detection method. Detectably different labels include, but are not limited to, labels that emit light of different wavelengths, labels that absorb light of different wavelengths, labels that have different fluorescent decay lifetimes, labels that have different spectral signatures, labels that have different radioactive decay properties, labels of different charge, and labels of different size.

The term “labeled terminator”, as used herein, refers to a terminator that is physically joined to a label. The linkage to the label is at a site or sites on the terminator that do not prevent the incorporation of the terminator by a polymerase into a polynucleotide.

As used herein, the term “target nucleic acid template” refers to a nucleic acid sequence that serves as a template for a primer extension reaction. Target nucleic acid templates include, but are not limited to, genomic DNA, including mitochondrial DNA and nucleolar DNA, cDNA, synthetic DNA, plasmid DNA, yeast artificial chromosomal DNA (YAC), bacterial artificial chromosomal DNA (BAC), and other extrachromosomal DNA, and primer extension products. Target nucleic acid templates also include, but are not limited to, RNA, synthetic RNA, mRNA, tRNA, and analogs of both RNA and DNA, such as peptide nucleic acids (PNA). In certain embodiments, target nucleic acid templates do not contain universal nucleotides.

Different target nucleic acid templates may be different portions of a single contiguous nucleic acid or may be on different nucleic acids. Different portions of a single contiguous nucleic acid may overlap.

“Primer” as used herein refers to a polynucleotide or oligonucleotide that has a free 3′—OH (or functional equivalent thereof) that can be extended by at least one nucleotide in a primer extension reaction catalyzed by a polymerase. In certain embodiments, primers do not contain universal nucleotides. In certain embodiments, primers may be of virtually any length, provided they are sufficiently long to hybridize to a polynucleotide of interest in the environment in which primer extension is to take place. In certain embodiments, primers are at least 14 nucleotides in length. Primers may be specific for a particular sequence, or, alternatively, may be degenerate, e.g., specific for a set of sequences.

The terms “primer extension” and “primer extension reaction” are used interchangeably, and refer to a process of adding one or more nucleotides to a nucleic acid primer, or to a primer extension product, using a polymerase, a template, and one or more nucleotides. In certain embodiments, a primer extension reaction includes at least one universal nucleotide. In other words, it includes at least one type of universal nucleotide, although it may include many molecules of each type of universal nucleotide. In certain embodiments, a primer extension reaction includes one type of universal nucleotide, although it may include many molecules of that type of universal nucleotide.

A “primer extension product” is produced when one or more nucleotides have been added to a primer in a primer extension reaction. In certain embodiments, a primer extension product includes one type of universal nucleotide. In certain embodiments, a primer extension product includes more than one type of universal nucleotide. In certain embodiments, a primer extension product is comprised of a 5′ sequence of specific nucleotides followed by one or more universal nucleotides. In certain embodiments, the 5′ sequence of specific nucleotides is at least 14 nucleotides in length. A primer extension product may serve as a target nucleic acid template in subsequent extension reactions. A primer extension product may include a terminator.

As used herein, the terms “polynucleotide”, “oligonucleotide”, and “nucleic acid” are used interchangeably and mean single-stranded and double-stranded polymers of nucleotide monomers, including 2′-deoxyribonucleotides (DNA) and ribonucleotides (RNA) linked by internucleotide phosphodiester bond linkages, or internucleotide analogs, and associated counter ions, e.g., H⁺, NH₄⁺, trialkylammonium, Mg²⁺, Na⁺ and the like. A polynucleotide may be composed entirely of deoxyribonucleotides, entirely of ribonucleotides, or chimeric mixtures thereof. The nucleotide monomer units may comprise any of the nucleotides described herein, including, but not limited to, specific nucleotides, nucleotide analogs, and universal nucleotides. Polynucleotides typically range in size from a few monomeric units, e.g. 5-40 when they are sometimes referred to in the art as oligonucleotides, to several thousands of monomeric nucleotide units. Unless denoted otherwise, whenever a polynucleotide sequence is represented, it will be understood that the nucleotides are in 5′ to 3′ order from left to right and that “A” denotes deoxyadenosine or an analog thereof, “C” denotes deoxycytidine or an analog thereof, “G” denotes deoxyguanosine or an analog thereof, and “T” denotes thymidine or an analog thereof, unless otherwise noted.

Polynucleotides may be composed of a single type of sugar moiety, e.g., as in the case of RNA and DNA, or mixtures of different sugar moieties, e.g., as in the case of RNA/DNA chimeras. In certain embodiments, nucleic acids are ribopolynucleotides and 2′-deoxyribopolynucleotides according to the structural formulae below:

wherein each B is independently the base moiety of a nucleotide, e.g., a purine, a 7-deazapurine, a pyrimidine, a specific nucleotide, or a universal nucleotide; each m defines the length of the respective nucleic acid and can range from zero to thousands, tens of thousands, or even more; each R is independently selected from the group comprising hydrogen, hydroxyl, halogen, —R″, —OR″, and —NR″R″, where each R″ is independently (C₁-C₆) alkyl or (C₅-C₁₄) aryl, or two adjacent Rs may be taken together to form a bond such that the ribose sugar is 2′,3′-didehydroribose, and each R′ may be independently hydroxyl or

where α is zero, one or two.

In certain embodiments of the ribopolynucleotides and 2′-deoxyribopolynucleotides illustrated above, the nucleotide bases B are covalently attached to the C1′ carbon of the sugar moiety as previously described.

The terms “nucleic acid”, “polynucleotide”, and “oligonucleotide” may also include nucleic acid analogs, polynucleotide analogs, and oligonucleotide analogs. The terms “nucleic acid analog”, “polynucleotide analog” and “oligonucleotide analog” are used interchangeably and, as used herein, refer to a polynucleotide that contains at least one nucleotide analog and/or at least one phosphate ester analog and/or at least one pentose sugar analog. Also included within the definition of polynucleotide analogs are polynucleotides in which the phosphate ester and/or sugar phosphate ester linkages are replaced with other types of linkages, such as N-(2-aminoethyl)-glycine amides and other amides (see, e.g., Nielsen et al., 1991, Science 254:1497-1500; WO 92/20702; U.S. Pat. No. 5,719,262; U.S. Pat. No. 5,698,685); morpholinos (see, e.g., U.S. Pat. No. 5,698,685; U.S. Pat. No. 5,378,841; U.S. Pat. No. 5,185,144); carbamates (see, e.g., Stirchak & Summerton, 1987, J. Org. Chem. 52: 4202); methylene(methylimino) (see, e.g., Vasseur et al., 1992, J. Am. Chem. Soc. 114: 4006); 3′-thioformacetals (see, e.g., Jones et al., 1993, J. Org. Chem. 58: 2983); sulfamates (see, e.g., U.S. Pat. No. 5,470,967); 2-aminoethylglycine, commonly referred to as PNA (see, e.g., Buchardt, WO 92/20702; Nielsen (1991) Science 254:1497-1500); and others (see, e.g., U.S. Pat. No. 5,817,781; Frier & Altman, 1997, Nucl. Acids Res. 25:4429 and the references cited therein). Phosphate ester analogs include, but are not limited to, (i) C₁-C₄ alkylphosphonate, e.g. methylphosphonate; (ii) phosphoramidate; (iii) C₁-C₆ alkyl-phosphotriester; (iv) phosphorothioate; and (v) phosphorodithioate.

The terms “annealing” and “hybridization” are used interchangeably and mean the base-pairing interaction of one nucleic acid with another nucleic acid that results in formation of a duplex, triplex, or other higher-ordered structure. When universal nucleotides are not involved, in certain embodiments, the primary interaction is base specific, e.g., A/T and G/C, by Watson/Crick and Hoogsteen-type hydrogen bonding. Base-stacking and hydrophobic interactions may also contribute to duplex stability.

The term “variant” as used herein refers to any alteration of a protein, including, but not limited to, changes in amino acid sequence, substitutions of one or more amino acids, addition of one or more amino acids, deletion of one or more amino acids, and alterations to the amino acids themselves. In certain embodiments, the changes involve conservative amino acid substitutions. Conservative amino acid substitution may involve replacing one amino acid with another that has, e.g., similar hydrophobicity, hydrophilicity, charge, or aromaticity. In certain embodiments, conservative amino acid substitutions may be made on the basis of similar hydropathic indices. A hydropathic index takes into account the hydrophobicity and charge characteristics of an amino acid, and in certain embodiments, may be used as a guide for selecting conservative amino acid substitutions. The hydropathic index is discussed, e.g., in Kyte et al., J. Mol. Biol., 157:105-131 (1982). It is understood in the art that conservative amino acid substitutions may be made on the basis of any of the aforementioned characteristics.

Alterations to the amino acids may include, but are not limited to, glycosylation, methylation, phosphorylation, biotinylation, and any covalent and noncovalent additions to a protein that do not result in a change in amino acid sequence. “Amino acid” as used herein refers to any amino acid, natural or nonnatural, that may be incorporated, either enzymatically or synthetically, into a polypeptide or protein.

As used herein, “mobility-dependent analysis technique” or “MDAT” means an analytical technique based on differential rates of migration among different analyte types. Exemplary mobility-dependent analysis techniques include electrophoresis, chromatography, mass spectroscopy, sedimentation, e.g., gradient centrifugation, field-flow fractionation, multi-stage extraction techniques, and the like.

As used herein, an “affinity set” is a set of molecules that specifically bind to one another. Affinity sets include, but are not limited to, biotin and avidin, biotin and streptavidin, receptor and ligand, antibody and ligand, antibody and antigen, and a polynucleotide sequence and its complement. One or more members of an affinity set may be coupled to a solid support. Exemplary solid supports include, but are not limited to, agarose, sepharose, magnetic beads, polystyrene, polyacrylamide, glass, membranes, silica, semiconductor materials, silicon, and organic polymers.

As used herein, “hybridization-based pullout”, or “HBP”, is a type of affinity separation wherein the affinity set is a polynucleotide sequence and its complement. HBP is a process wherein a nucleotide sequence is bound or immobilized to a solid support and is used to selectively adsorb its complement sequence (see, e.g., U.S. patent application Ser. No. 08/873,437 to O'Neill et al., filed Jun. 12, 1997).

Certain Exemplary Embodiments of the Invention

The present invention is directed to methods and kits for generating and analyzing primer extension products. Such primer extension products are generated by incubating a reaction composition comprising at least one universal nucleotide under appropriate conditions suitable for effecting primer extension. According to certain embodiments, the reaction composition comprises at least one terminator and at least one universal nucleotide. According to certain embodiments, the invention provides methods and kits for sequencing nucleic acids using a reaction composition comprising at least one universal nucleotide.

Exemplary Components

According to certain embodiments of the present invention, universal nucleotides comprise unnatural, predominantly hydrophobic bases that can pack efficiently in duplex DNA (see, e.g., Berger et al. Angew. Chem. Int. Ed. Engl. (2000) 39: 2940-42; Wu et al. J. Am. Chem. Soc. (2000) 122: 7621-32; Berger et al. Nuc. Acids Res. (2000) 28: 2911-14; Smith et al. Nucleosides & Nucleotides (1998) 17: 541-554; Ogawa et al. J. Am. Chem. Soc. (2000) 122:3274-87). According to certain embodiments, a universal nucleotide may pair with two or more of the natural bases found in DNA. According to certain embodiments of the invention, universal nucleotides may lack the specific hydrogen bonding interaction of natural base pairs, and therefore may substitute for two or more bases in a DNA strand simply by steric and hydrophobic interactions.

According to certain embodiments of the invention, the universal nucleotides include, but are not limited to, 2′-deoxy-7-azaindole-5′-triphosphate (d7AITP), 2′-deoxy-isocarbostyril-5′-triphosphate (dICSTP), 2′-deoxy-propynylisocarbostyril-5′-triphosphate (dPICSTP), 2′-deoxy-6-methyl-7-azaindole-5′-triphosphate (dM7AITP), 2′-deoxy-imidazopyridine-5′-triphosphate (dImPyTP), 2′-deoxy-pyrrollpyrizine-5′-triphosphate (dPPTP), 2′-deoxy-allenyl-7-azaindole-5′-triphosphate (dA7AITP), or 2′-deoxy-propynyl-7-azaindole-5′-triphosphate (dP7AITP). In certain embodiments, the universal nucleotides are utilized by a polymerase, e.g., a DNA polymerase, at a rate nearly equal to the rate at which specific nucleotides are incorporated.

Certain embodiments of the invention employ a polymerase that has been optimized for use with universal nucleotides. According to certain embodiments, methods of optimizing a polymerase include, but are not limited to, site-directed mutagenesis, nonspecific mutagenesis, deletion of one or more amino acids, addition of one or more amino acids, substitution of one or more amino acids, and post-translational modifications, which include, but are not limited to, proteolysis, deletion of carbohydrate groups or phosphates, and addition of carbohydrate groups or phosphates. Thus, polymerases include naturally-occurring polymerases and modified polymerases or variant polymerases, including those modified for optimal incorporation of universal nucleotides.

According to certain embodiments, a polymerase incorporates universal nucleotides into a primer extension product at a rate that is at least 10% of the rate at which specific nucleotides are incorporated by the same polymerase. In certain embodiments, universal nucleotides are incorporated by polymerase at a rate that is at least 25% of the rate at which specific nucleotides are incorporated by the same polymerase. In certain embodiments, universal nucleotides are incorporated at a rate that is at least 50% of the rate at which specific nucleotides are incorporated. In certain embodiments, universal nucleotides are incorporated at a rate that is at least 75% of the rate at which specific nucleotides are incorporated. In certain embodiments, polymerase incorporates universal nucleotides into a primer extension product at a rate that is equal to, or substantially equal to, the rate at which specific nucleotides are incorporated. According to certain embodiments, polymerase incorporates universal nucleotides at a rate that is sufficient to reduce premature chain termination.

Polymerases for use in the invention may or may not be thermostable. In certain embodiments, polymerases have mutations that reduce discrimination against the incorporation of chain terminators that are 3′-dideoxynucleotides as compared with nucleotide triphosphates. In certain embodiments, one can use mutants having a Tyr residue at position 667 (numbered with reference to Taq DNA polymerase). A detailed description of such mutants can be found, e.g., in U.S. Pat. No. 5,614,365. Such mutant polymerases may conveniently be referred to collectively as Y667 mutants.

According to certain embodiments, polymerases include, but are not limited to, DNA polymerase, RNA polymerase, reverse transcriptase, T7 polymerase, SP6 polymerase, T3 polymerase, Sequenase, Klenow fragment, AmpliTaq FS, a thermostable DNA polymerase with minimal or no 3′-5′ exonuclease activity, or an enzymatically active variant or fragment of any of the above polymerases. According to certain embodiments of the invention, a mixture of two or more polymerases is used.

Primer Extension

Primer extension reactions, according to certain embodiments, are used to make a complementary copy of at least a portion of a target nucleic acid template. In certain primer extension reactions, one uses an extension reaction composition comprising a target nucleic acid template, at least one primer, at least one universal nucleotide, and at least one polymerase. The at least one primer anneals to the target template. A primer extension product is generated when the polymerase enzymatically adds one or more nucleotides to the 3′ end of the primer that is annealed to the target nucleic acid template.

The primer extension reaction may contain a combination of specific nucleotides and universal nucleotides, or it may contain exclusively universal nucleotides. The nucleotide that is added to the 3′ end of the primer (or the 3′ end of the primer extension product being extended from the primer) by polymerase may be a specific nucleotide, such that it is added in a template sequence-specific manner and pairs specifically with the template nucleotide opposite it. Alternatively, the nucleotide that is added by polymerase may be a universal nucleotide, which is added in a non-template sequence-specific manner. The polymerase adds nucleotides to the 3′ end of the growing primer extension product until it reaches the end of the target nucleic acid template, or until it prematurely terminates before the end of the target nucleic acid template, e.g., by falling off the template, or by incorporation of a terminator, if present.

The result of the primer extension reaction is a primer extension product, which comprises the primer at its 5′ end, covalently linked to a string of nucleotides that have been incorporated by polymerase. In certain embodiments, the string of nucleotides may comprise exclusively universal nucleotides of one type, or may comprise exclusively universal nucleotides of more than one type. In certain exemplary embodiments, the string of nucleotides may comprise a single type of universal nucleotide that pairs with A, T, C, and G. In certain exemplary embodiments, the string of nucleotides may comprise one type of universal nucleotide that pairs with C and G and another type of universal nucleotide that pairs with A and T. In certain exemplary embodiments, the string of nucleotides may comprise two different types of universal nucleotides that pair with A, T, C, and G. In certain embodiments, the string of nucleotides may comprise a combination of universal nucleotides of one type and specific nucleotides of one or more pairing types. In certain exemplary embodiments, the string of nucleotides may comprise a universal nucleotide that pairs with C and G in the template, and specific nucleotides A and T, which pair with T and A in the template, respectively. In certain embodiments, the string of nucleotides may comprise a combination of universal nucleotides of more than one type and specific nucleotides of one or more pairing types. In an exemplary case, the string of nucleotides may comprise two types of universal nucleotides, one which pairs with C and G, and the other which pairs with G and A, and specific nucleotide A, which pairs with T in the template.
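One of the exemplary embodiments above (a universal nucleotide opposite C and G, with specific A and T opposite T and A) can be illustrated with a minimal sketch. The symbol `U` for the universal nucleotide, the function name `extend`, and the assumption that the polymerase always chooses the universal nucleotide opposite C or G are illustrative only and are not taken from the specification.

```python
# Illustrative pairing rules (assumptions, not from the specification):
# 'U' is a placeholder symbol for one universal nucleotide that is
# incorporated opposite C or G; A and T are specific nucleotides.
SPECIFIC_PARTNER = {"T": "A", "A": "T"}
UNIVERSAL_TARGETS = {"C", "G"}


def extend(template_3_to_5: str) -> str:
    """Return the nucleotides added to the primer, in 5'->3' order.

    The template is given in the order the polymerase reads it (3'->5').
    """
    added = []
    for base in template_3_to_5:
        if base in UNIVERSAL_TARGETS:
            added.append("U")  # non-template-sequence-specific addition
        elif base in SPECIFIC_PARTNER:
            added.append(SPECIFIC_PARTNER[base])  # sequence-specific addition
        else:
            raise ValueError(f"unexpected template base: {base!r}")
    return "".join(added)


print(extend("TACGGA"))
```

In this sketch the product keeps sequence information only at positions opposite A and T; positions opposite C and G collapse to the same universal symbol.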

According to certain embodiments, the primer extension reaction is part of a polymerase chain reaction (PCR). A general description of PCR is provided, e.g., in PCR Protocols: A Guide to Methods and Applications, Academic Press, New York, N.Y. (1990); and in PCR Primers: A Laboratory Manual, Cold Spring Harbor Laboratory Press, N.Y. (1995). In PCR, the reaction composition includes at least one template, at least one primer, at least one polymerase, and extendable nucleotides. At least one universal nucleotide is included in the reaction composition. The reaction composition is subjected to cycles of temperature changes which result in a primer extension reaction that generates a primer extension product complementary to at least a portion of the target template, separation of the primer extension reaction product from the template, annealing of a new primer to at least a portion of the template and/or to the primer extension product, and subsequent primer extension reactions that generate primer extension products complementary to at least a portion of the template and/or complementary to at least a portion of the previously generated primer extension products.

In certain embodiments, the reaction composition contains one or more “primer sets”, which comprise a forward primer and a reverse primer that anneal to opposite strands of the same double-stranded template. The forward primer anneals to one strand of the template, and the reverse primer anneals to the other strand of the template, such that the primer extension product from a forward primer comprises a sequence that is complementary to at least a portion of the primer extension product from the reverse primer. In subsequent primer extension reactions, the forward primer may anneal to the primer extension product from the reverse primer, and the reverse primer may anneal to the primer extension product from the forward primer.

Asymmetric PCR (A-PCR) according to the present invention comprises an amplification reaction composition wherein (i) at least one primer set comprises only a forward primer or only a reverse primer; (ii) there is an excess of one primer (relative to the other primer in a primer set); or (iii) in at least one primer set, the Tm₅₀ of the first primer is at least 6-8° C. different from the Tm₅₀ of the second primer. In certain embodiments, the Tm₅₀ of the first primer is at least 10-12° C. different from the Tm₅₀ of the second primer. Consequently, following a primer extension reaction, an excess of products that are complementary to at least a portion of one strand of the template are generated relative to products that are complementary to at least a portion of the other strand of the template.
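The three alternative conditions for asymmetric PCR can be expressed as a simple check on a primer set. The following is a minimal sketch: the `Primer` record, its field names, the concentration units, and the default 6° C. threshold (the lower bound of the stated "6-8° C." range) are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Primer:
    # Hypothetical record; field names and units are illustrative.
    concentration_nm: float
    tm50_c: float


def is_asymmetric(forward: Optional[Primer],
                  reverse: Optional[Primer],
                  min_tm_gap_c: float = 6.0) -> bool:
    """Return True if the primer set meets any of conditions (i)-(iii)."""
    if forward is None and reverse is None:
        raise ValueError("a primer set needs at least one primer")
    # (i) only a forward primer or only a reverse primer
    if forward is None or reverse is None:
        return True
    # (ii) an excess of one primer relative to the other
    if forward.concentration_nm != reverse.concentration_nm:
        return True
    # (iii) Tm50 values differ by at least the chosen threshold
    return abs(forward.tm50_c - reverse.tm50_c) >= min_tm_gap_c


print(is_asymmetric(Primer(500.0, 60.0), None))                 # condition (i)
print(is_asymmetric(Primer(500.0, 60.0), Primer(50.0, 60.0)))   # condition (ii)
print(is_asymmetric(Primer(500.0, 55.0), Primer(500.0, 62.0)))  # condition (iii)
```

Passing `min_tm_gap_c=10.0` would model the stricter embodiment in which the Tm₅₀ values differ by at least 10-12° C.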

In certain embodiments of the invention, the asymmetric PCR reaction composition comprises at least one primer set having at least one forward primer, or at least one reverse primer, but typically not both. In such embodiments, primer extension reactions typically produce primer extension products that are complementary to at least a portion of one strand of the template, but not products complementary to at least a portion of the other strand. In each subsequent round of primer extension reaction, a new primer anneals to the template to produce a primer extension product. In certain embodiments, only the template, and not the primer extension product, is amplified in each subsequent round of asymmetric PCR.

In certain embodiments, the invention provides methods of asynchronous PCR. (See, e.g., U.S. patent application Ser. No. 09/875,211, filed Jun. 5, 2001.)

In certain embodiments, one can amplify multiple target sequences simultaneously using multiple sets of one or more primers specific for each of the target sequences, which may be referred to as multiplex PCR. (See, e.g., H. Geada et al., Forensic Sci. Int. 108:31-37 (2000) and D. G. Wang et al., Science 280:1077-82 (1998)).

Microsatellites, including STRs and VNTRs, are regions in the genome that contain a tandem array of a repeated sequence. The number of repeats may vary from individual to individual, or may be a marker for disease, and therefore these regions may be used for diagnostic purposes. The repeated sequence can be from 2 to about 80 nucleotides long, and the number of repeats ranges into the hundreds. The analysis of microsatellites, including STRs, typically requires precise determination of the number of repeats, and the difference in repeat number between two target nucleic acid templates can be as little as one or two.

In certain embodiments, STRs are analyzed by first amplifying the repeat region using PCR, with primers that flank the repeat region. The amplified primer extension products are then separated based on size. By analyzing the size of the products, one can determine the number of repeats in the STR being analyzed. In certain embodiments, one can analyze more than one STR region in the same reaction composition by using different labeled primers for each different STR region.
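The size-to-repeat-number step can be made concrete with a short calculation. This is a minimal sketch, assuming the amplified product consists only of the flanking sequence (including both primers) plus a whole number of repeat units; the function name and the example lengths are hypothetical.

```python
def repeat_count(product_length: int,
                 flank_length: int,
                 repeat_unit_length: int) -> int:
    """Infer the number of tandem repeats from a measured product length.

    flank_length is the combined length of everything in the product
    that is not part of the repeat region (primers plus flanking bases).
    """
    if repeat_unit_length <= 0:
        raise ValueError("repeat unit length must be positive")
    repeat_region = product_length - flank_length
    if repeat_region < 0:
        raise ValueError("product is shorter than its flanking sequence")
    count, remainder = divmod(repeat_region, repeat_unit_length)
    if remainder:
        # A partial repeat suggests a sizing error or a non-consensus allele.
        raise ValueError("repeat region is not a whole number of units")
    return count


# Hypothetical example: a 148-nt product with 100 nt of flanking
# sequence and a 4-nt repeat unit.
print(repeat_count(148, 100, 4))
```

Because alleles can differ by a single repeat, the sizing method must resolve length differences as small as one repeat unit.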

In certain embodiments, a reaction composition comprises at least one polymerase, at least one primer set, at least one target nucleic acid template, and at least one universal nucleotide. The reaction composition may or may not contain specific nucleotides. In certain embodiments, at least one primer in each primer set comprises a label. In certain embodiments, different primers for different templates have different labels.

In the primer extension reaction, polymerase adds nucleotides to the 3′ end of the primer or primer extension product. In certain embodiments, specific nucleotides are added to the 3′ end of the primer extension product according to the sequence of the template, while universal nucleotides are added non-specifically. In certain embodiments, the resulting primer extension product contains the primer sequence at its 5′ end, covalently linked to a string of nucleotides that have been incorporated by polymerase. In certain embodiments, the string of nucleotides may comprise exclusively universal nucleotides of one type, or may comprise exclusively universal nucleotides of more than one type. In certain embodiments, the string of nucleotides may comprise a combination of universal nucleotides of one type and specific nucleotides of one or more pairing types, or may comprise a combination of universal nucleotides of more than one type and specific nucleotides of one or more pairing types.

In certain embodiments, the primer extension products may be separated by a mobility-dependent analysis technique, or MDAT. In certain embodiments, the primer extension products may be separated based on, e.g., molecular weight, length, sequence, and/or charge. Any method that allows two or more nucleic acid sequences in a mixture to be distinguished, e.g., based on mobility, length, molecular weight, sequence and/or charge, is within the scope of the invention. Exemplary MDATs include, without limitation, electrophoresis, such as gel or capillary electrophoresis, HPLC, mass spectroscopy, including MALDI-TOF, and gel filtration. In certain embodiments, the MDAT is electrophoresis or chromatography.

In certain embodiments, the identity of the label that is attached to a primer extension product correlates with the identity of the primer, and therefore correlates with the identity of the template to which it anneals, and thus with the identity of the region being analyzed. Also, by separating the primer extension products, one may determine the number of repeats in a given STR. In this manner, the primer extension products of several different primer sets and target nucleic acid templates may be compared in a single primer extension reaction.

In various embodiments, different numbers of pairing types of specific nucleotides may be employed. In certain embodiments, the reaction composition comprises four specific extendable nucleotides and at least one universal nucleotide. In certain embodiments, the reaction composition comprises three specific extendable nucleotides and at least one universal nucleotide. In certain embodiments, the reaction composition comprises two specific extendable nucleotides and at least one universal nucleotide. In certain embodiments, the reaction composition comprises one specific extendable nucleotide and at least one universal nucleotide.

In certain such embodiments, the polymerase incorporates the specific extendable nucleotides and the at least one universal nucleotide into the primer extension product. In certain embodiments, at least some of the time, the at least one universal nucleotide, rather than a specific nucleotide, is incorporated by polymerase opposite one or more of the specific nucleotides in the template sequence.

In certain embodiments, the reaction composition comprises no specific extendable nucleotides, and at least one universal nucleotide. The at least one universal nucleotide is incorporated by polymerase into the primer extension product opposite all of the specific nucleotides in the template.

In certain embodiments of the invention, STRs are analyzed by asymmetric PCR. In certain embodiments of the invention, the reaction composition contains one or more primer sets, each of which contains only a forward or only a reverse primer. In certain embodiments, each different primer comprises a different label. In certain embodiments, the primer extension reaction for at least one primer set results in primer extension products that are complementary to only one strand of the template.

In certain embodiments of the invention, the primer is an oligonucleotide primer and the polynucleotide molecule for analysis is genomic DNA or cDNA. In certain embodiments, annealing the primer and the template, or duplex formation, may take place by hybridization. The primer/template duplex may contain one or more mismatches that do not significantly interfere with the ability of a polymerase to extend the primer or interfere with the ability of the 3′ terminus nucleotide base of the primer to hybridize immediately adjacent to a predetermined location on the target nucleic acid template.

In certain embodiments, the initial target nucleic acid template is processed prior to amplifying it in the presence of universal nucleotides. In certain embodiments, the initial target nucleic acid template is processed to create a nucleic acid that comprises an STR and a constant length flanking region on one or both ends of the STR. In certain embodiments, such processed nucleic acids may be used as target nucleic acid templates in subsequent extension reactions to create extension products that have constant length flanking regions on both ends of the STR and that vary in length by the number of nucleotides in the STR.

In certain embodiments, the initial target nucleic acid template is processed by subjecting it to initial cycles of PCR in a first reaction composition comprising specific nucleotides and not comprising universal nucleotides. The resulting first primer extension products are then amplified in the presence of universal nucleotides. In certain embodiments, one or more initial cycles of PCR are performed with a first reaction composition that comprises specific nucleotides and both forward and reverse primers that flank each STR region, but that does not comprise universal nucleotides. Such initial cycles generate various first primer extension products that comprise flanking regions of predetermined length and the STR region. In certain embodiments, such first primer extension products serve as target nucleic acid templates for subsequent cycles of PCR with a second reaction composition that comprises at least one universal nucleotide and at least one primer. Such subsequent cycles and the remainder of the methods for detecting STRs, including separating and detecting of the second extension products, may be carried out as discussed above.

In certain embodiments, most or all of the initial target nucleic acid templates are removed prior to subsequent cycles of PCR with a reaction composition that comprises at least one universal nucleotide, and the first primer extension products serve as templates in such subsequent cycles of PCR. In certain embodiments, the initial target nucleic acid templates are modified with a first member of an affinity set. In certain embodiments, the initial target nucleic acid templates are bound to a second member of the affinity set before, during, or after the initial cycles of PCR without universal nucleotides. In certain embodiments, the second member of the affinity set is coupled to a solid support so that most or all of the initial target nucleic acid templates may be separated from the reaction composition before the subsequent cycles of PCR with the at least one universal nucleotide. In certain embodiments, most or all of the initial target nucleic acid templates are removed with hybridization-based pull-out (HBP).

In certain embodiments, the initial target nucleic acid template is processed by digestion with restriction endonucleases prior to amplification in the presence of universal nucleotides. In certain embodiments, one or more initial target nucleic acid templates comprise one or more STR regions. The initial target nucleic acid templates are digested with one or more restriction endonucleases prior to amplification of the STR region by PCR in the presence of universal nucleotides. In certain embodiments, the initial target nucleic acid templates are digested in one or both regions flanking each STR region that is to be amplified to obtain target nucleic acid templates with constant length flanking regions on one or both ends of the STR. Such digested target nucleic acid templates can then be used in extension reactions to obtain primer extension products that have constant length flanking regions on both ends of the STR. The remainder of the methods for detecting STRs, including separating and detecting of the primer extension products, may be carried out as discussed above.

Analysis of microsatellites may be difficult if there is secondary structure present during an MDAT, e.g., electrophoresis, mass spectroscopy, or chromatography, which can cause aberrant mobility of the amplified products. According to certain embodiments, by replacing one or more of the extendable nucleotides in the primer extension reaction with one or more universal nucleotides, secondary structure formation during separation may be reduced. According to certain embodiments, repeat regions within the genome that are longer than microsatellite regions may be analyzed using universal nucleotides in the primer extension reaction, e.g., as part of a PCR reaction. STRs and methods of analyzing them are described, e.g., in U.S. Pat. Nos. 5,364,759, 5,075,217, 6,090,558 and 6,221,598, which are herein incorporated by reference for any purpose.

The sequence of a nucleic acid may be determined by the creation of a primer extension product, e.g., by the method of Sanger (see, e.g., Sanger et al. Proc. Nat. Acad. Sci. 74: 5463-5467 (1977)). According to certain embodiments, the present invention provides methods for sequencing nucleic acids using universal nucleotides in the reaction composition. In certain embodiments, a duplex (double stranded polynucleotide) is formed between a target nucleic acid template and a primer. The primer hybridizes to a predetermined location on the target nucleic acid template. In certain embodiments, one or more extendable nucleotides, including at least one universal nucleotide, one or more polymerases, and one or more specific terminators are included in the reaction composition with the primer. The reaction composition may or may not contain specific nucleotides.

The reaction composition is incubated under appropriate reaction conditions, such that one or more extendable nucleotides are incorporated sequentially by polymerase onto the 3′ end of the primer. Specific nucleotides, if present, are added by polymerase in a template sequence-specific manner, while universal nucleotides are added in a non-template sequence-specific manner. A specific terminator may be incorporated into the primer extension product, and once incorporated, prevents further incorporation of nucleotides to the 3′ end of the primer extension product by polymerase. The primer extension products generated by the primer extension reaction may then be separated based on size, and the sequence of the nucleic acid template can be determined from the particular sizes of the products and the identity of the specific terminator on each product.

In certain embodiments, the reaction composition contains four different specific terminators, e.g., A terminators, T terminators, G terminators, and C terminators, each of which is coupled to a different label. Each of the primer extension products that are generated therefore contains one of the four specific terminators at its 3′ end, and the identity of this terminator correlates with the identity of the label. Furthermore, the identity of the nucleotide on the template strand opposite the terminator can be determined by the identity of the terminator (and therefore, the identity of the label). For example, if a primer extension product has a C terminator at its 3′ end, then the template contains a G opposite the terminator. The length of the primer extension product determines where in the template sequence the G is located.
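The read-out logic described above, in which product length gives the position and the terminator gives the template base, can be sketched as follows. The input format, the function name `read_template`, and the use of `N` for positions with no observed product are assumptions made for illustration; the sketch relies only on the terminators, so it is unaffected by whether the internal positions of each product hold specific or universal nucleotides.

```python
# Watson-Crick partner of each specific terminator.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def read_template(fragments, primer_length):
    """Reconstruct the template from terminated primer extension products.

    fragments: iterable of (product_length, terminator_base) pairs, where
    the terminator base has already been decoded from its label.
    Returns the template bases in the order the polymerase reads them
    (3'->5'), starting immediately after the primer binding site.
    """
    called = {}
    for product_length, terminator in fragments:
        position = product_length - primer_length  # 1-based offset from primer
        if position < 1:
            raise ValueError("product is not longer than the primer")
        called[position] = COMPLEMENT[terminator]
    if not called:
        return ""
    # Positions with no observed product are reported as 'N'.
    return "".join(called.get(p, "N") for p in range(1, max(called) + 1))


# Hypothetical data: a 20-nt primer and four terminated products.
products = [(21, "C"), (22, "A"), (23, "T"), (24, "G")]
print(read_template(products, primer_length=20))
```

For the hypothetical data shown, the product ending in a C terminator at length 21 places a G at the first template position after the primer, matching the example in the text.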

In certain embodiments using at least one universal nucleotide, the primer in the reaction composition further comprises a label and the terminators are not labeled. In certain embodiments, each of four different reaction compositions includes a primer that anneals to the same location on the template, but the primer in each of the different reaction compositions comprises a different label. The primer hybridizes to a predetermined location on the target nucleic acid template.

In certain embodiments, one or more extendable nucleotides, including at least one universal nucleotide, one or more polymerases, and one or more specific terminators are included in the reaction composition. In certain embodiments, a different unlabeled terminator is included in each of the four reaction compositions. The reaction composition may or may not contain specific nucleotides. The reaction composition is incubated under appropriate reaction conditions, such that one or more extendable nucleotides are incorporated sequentially by polymerase onto the 3′ end of the primer. Specific nucleotides, if present, are added by polymerase in a template sequence-specific manner, while universal nucleotides are added in a non-specific manner.

In certain of those embodiments, each primer extension reaction generates primer extension products that have only one type of terminator at their 3′ ends. The identity of the label that is coupled to the primer correlates to the identity of the terminator, and therefore the identity of the nucleotide opposite the terminator on the template. In certain embodiments, the primer extension products from the four separate reactions may be combined. The products may then be analyzed by an MDAT, e.g., separated based on size. The sequence of the template may then be determined from the particular sizes of the products and the identity of the terminator of each product.

In certain embodiments, more than one template may be sequenced in the same reaction composition. In certain embodiments, a reaction composition may contain two different primers that anneal to two different templates, each of the different primers comprising a different label. In certain embodiments, the reaction composition may further comprise four different terminators, each comprising a different label. In certain embodiments, one or more extendable nucleotides, including at least one universal nucleotide, and one or more polymerases are also included in the reaction composition. The reaction composition may or may not contain specific nucleotides. The reaction composition is incubated under appropriate reaction conditions, such that one or more extendable nucleotides are incorporated sequentially by polymerase onto the 3′ end of each of the different primers, according to the template sequence to which each primer anneals. Specific nucleotides, if present, are added by polymerase in a template sequence-specific manner, while universal nucleotides are added in a non-specific manner.

The primer extension reaction, therefore, generates primer extension products that each have a label that identifies the primer that was extended, and another label that identifies the terminator at the 3′ end. In certain embodiments, the primer extension products may be separated. The sequence of the template may be determined from the particular sizes of the products, the identity of the primer, and the identity of the terminator of each product. Therefore, the sequence of each of the two templates may be determined simultaneously.
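Reading two templates from one reaction amounts to grouping the products by primer label before calling bases. The sketch below assumes each detected product is reported as a (primer label, terminator base, length) triple and that the length of each labeled primer is known; the label names and all values are hypothetical.

```python
from collections import defaultdict

# Watson-Crick partner of each specific terminator.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def read_templates(products, primer_lengths):
    """Call one template sequence per primer label.

    products: iterable of (primer_label, terminator_base, product_length).
    primer_lengths: mapping from primer label to primer length.
    Returns {primer_label: template bases in 3'->5' reading order}.
    """
    calls = defaultdict(dict)
    for primer_label, terminator, product_length in products:
        position = product_length - primer_lengths[primer_label]
        calls[primer_label][position] = COMPLEMENT[terminator]
    return {
        label: "".join(by_position[p] for p in sorted(by_position))
        for label, by_position in calls.items()
    }


# Hypothetical mixed read-out from a single reaction composition.
mixed = [
    ("blue", "A", 19),
    ("green", "G", 26),
    ("blue", "C", 20),
    ("green", "T", 27),
]
print(read_templates(mixed, {"blue": 18, "green": 25}))
```

The primer label selects the template, the terminator gives the base, and the length gives the position, so products of equal length from different templates remain distinguishable.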

In certain embodiments, the reaction composition may contain more than two different primers, each comprising a different label, that anneal to different templates. In certain embodiments, the reaction composition contains the different labeled primers and four different terminators, each comprising a different label. In certain embodiments, the reactions are carried out substantially as described above for two different primers.

In certain embodiments, a reaction composition may comprise one type of unlabeled terminator and two or more different labeled primers that are specific for two or more different templates. In certain embodiments, four different reaction compositions each comprise a different unlabeled terminator and the two or more different labeled primers. In certain embodiments, each reaction composition is then subjected to a primer extension reaction. The extension product of each reaction composition is then separated. The label will indicate which template correlates to the extension product. The identity of the terminated nucleotide, and thus, the identity of the template nucleotide opposite it, can be determined based on the reaction composition from which the primer extension product was generated. The length of the product will indicate where the nucleotide is included in the template.

In various embodiments, different numbers of pairing types of specific nucleotides may be employed. In certain embodiments, the reaction composition comprises four specific extendable nucleotides and at least one universal nucleotide. In certain embodiments, the reaction composition comprises three specific extendable nucleotides and at least one universal nucleotide. In certain embodiments, the reaction composition comprises two specific extendable nucleotides and at least one universal nucleotide. In certain embodiments, the reaction composition comprises one specific extendable nucleotide and at least one universal nucleotide.

In certain such embodiments, the polymerase incorporates the specific extendable nucleotides and the at least one universal nucleotide into the primer extension product. In certain embodiments, at least some of the time, the at least one universal nucleotide, rather than a specific nucleotide, is incorporated by polymerase opposite one or more of the specific nucleotides in the template sequence.

In certain embodiments, the reaction composition comprises no specific extendable nucleotides, and at least one universal nucleotide. The at least one universal nucleotide is incorporated by polymerase into the primer extension product opposite all of the specific nucleotides in the template.

In certain embodiments, the primer extension products may be separated by a mobility-dependent analysis technique, or MDAT. In certain embodiments, the primer extension products may be separated based on, e.g., molecular weight, length, sequence, and/or charge. Any method that allows two or more nucleic acid sequences in a mixture to be distinguished, e.g., based on mobility, length, molecular weight, sequence and/or charge, is within the scope of the invention. Exemplary separation techniques include, without limitation, electrophoresis, such as gel or capillary electrophoresis, HPLC, mass spectroscopy, including MALDI-TOF, and gel filtration. In certain embodiments, the MDAT is electrophoresis or chromatography. By separating the primer extension products, one can determine the sequence of the template nucleic acid based on the size of each product and the identity of the terminator at its 3′ end.

In certain embodiments of the invention, the primer is an oligonucleotide primer and the polynucleotide molecule for analysis is genomic DNA or cDNA. In certain embodiments, annealing the primer and the template, or duplex formation, may take place by hybridization. The primer/template duplex may contain one or more mismatches that do not significantly interfere with the ability of a polymerase to extend the primer or interfere with the ability of the 3′ terminus nucleotide base of the primer to hybridize immediately adjacent to a predetermined location on the target nucleic acid template.

In certain embodiments, the methods include cycle sequencing, in which, following the primer extension reaction and termination, the primer extension product is released from the target nucleic acid template, and a new primer is annealed, extended, and terminated in the same manner. Cycle sequencing allows amplification of the primer extension products. In certain embodiments, cycle sequencing is performed using a thermocycler apparatus.
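
To illustrate the amplification noted above, a rough upper bound on product yield can be sketched in Python. The sketch assumes that a single primer is extended on each template strand, so that released products do not themselves serve as templates for that primer and the yield grows at most linearly with the number of cycles. The function name `max_products` and the per-cycle `efficiency` parameter are hypothetical, and actual yields will also depend on factors such as reagent depletion and polymerase stability.

```python
def max_products(template_copies, cycles, efficiency=1.0):
    """Upper bound on extension products after a number of cycles.

    Assumes linear amplification: in each cycle, a fraction `efficiency`
    (between 0 and 1) of the template copies yields one terminated product.
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    if template_copies < 0 or cycles < 0:
        raise ValueError("template_copies and cycles must be non-negative")
    return template_copies * cycles * efficiency


if __name__ == "__main__":
    # 1000 template copies carried through 25 cycles at 50% efficiency.
    print(max_products(1000, 25, efficiency=0.5))
```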

In certain embodiments, the primer and/or the terminator is labeled. In certain embodiments, the label comprises a fluorescent dye. In certain embodiments, the reaction contains four different terminators, each labeled with a different fluorescent dye. In certain embodiments, four reaction compositions each contain a primer that is labeled with a different fluorescent dye. In certain embodiments, the four primers have the same sequence. In certain embodiments, one reaction composition contains more than one different primer, and each different primer is labeled with a different fluorescent dye. In certain embodiments, the primer extension products are separated, e.g., by electrophoresis, mass spectroscopy, or chromatography.

DNA sequencing technology may be limited by variability that can result from the differences between the four specific bases of DNA. For example, during separation of the sequencing products, compressions may result from secondary structure that occurs in regions of high G-C content. These compressions can cause multiple products to run at the same size, resulting in several primer extension product peaks overlapping following electrophoresis, mass spectroscopy, or chromatography. In certain embodiments, the present invention provides methods that may reduce secondary structure in primer extension products, thereby reducing compressions, by replacing one or more of the dNTPs in the sequencing reaction with at least one universal nucleotide.

Also, the use of at least one universal nucleotide according to certain embodiments may reduce premature chain termination in an extension reaction. Premature chain termination is termination of the extension reaction prior to incorporation of a terminator in the extension product.

Kits

The invention also provides kits for performing the foregoing methods. In certain embodiments, kits serve to expedite the performance of the methods of interest by assembling two or more components used to carry out the methods. In certain embodiments, kits contain components in pre-measured unit amounts to minimize the need for measurements by end-users. In certain embodiments, kits include instructions for performing one or more methods of the invention. In certain embodiments, the kit components are optimized to operate in conjunction with one another.

In certain embodiments, the kits of the invention may be used to sequence at least one target nucleic acid template. In certain embodiments, the kits for sequencing target nucleic acid templates include at least one universal nucleotide, at least one polymerase, and at least one specific terminator. In certain embodiments, kits for sequencing target nucleic acid templates may contain additional components, including, but not limited to, at least one primer. In certain embodiments, the at least one specific terminator and/or the at least one primer may further comprise a label. Kits may also include the reagents for performing a control reaction, which may include one or more of the above components, and at least one target nucleic acid template.

In certain embodiments, the kits of the invention may be used to generate a plurality of primer extension products. In certain embodiments, the kits may be used for STR analysis. In certain embodiments, kits for STR analysis include at least one universal nucleotide and at least one polymerase. In certain embodiments, kits for STR analysis may include at least one primer. In certain embodiments, the at least one primer may further comprise a label. Kits for STR analysis may also include the reagents for performing a control reaction, which may include one or more of the above components, and at least one target nucleic acid template.

Other embodiments of the invention will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. It is intended that the specification and examples be considered as exemplary only.

1. A method of sequencing at least one target nucleic acid template comprising: (a) forming a reaction composition comprising at least one target nucleic acid template, at least one primer, at least one polymerase, at least one universal nucleotide, and at least one specific terminator comprising a label; (b) incubating the reaction composition under appropriate conditions to generate at least one primer extension product comprising one or more of the at least one universal nucleotides and one or more of the at least one specific terminators; (c) separating one or more of the at least one primer extension products, wherein the separating comprises at least one mobility-dependent analysis technique (MDAT); and (d) detecting one or more of the at least one primer extension products.
 2. The method of claim 1, wherein the reaction composition further comprises at least one specific extendable nucleotide.
 3. The method of claim 1, wherein the reaction composition does not include a specific extendable nucleotide.
 4. The method of claim 1, wherein the at least one universal nucleotide is one type of universal nucleotide.
 5. The method of claim 1, wherein the at least one universal nucleotide are two types of universal nucleotides.
 6. The method of claim 1, wherein the at least one universal nucleotide are three types of universal nucleotides.
 7. The method of claim 1, wherein one or more of the at least one universal nucleotides is individually selected from 2′-deoxy-7-azaindole-5′-triphosphate (d7AITP), 2′-deoxy-isocarbostyril-5′-triphosphate (dICSTP), 2′-deoxy-propynylisocarbostyril-5′-triphosphate (dPICSTP), 2′-deoxy-6-methyl-7-azaindole-5′-triphosphate (dM7AITP), 2′-deoxy-imidizopyridine-5′-triphosphate (dImPyTP), 2′-deoxy-pyrrollpyrizine-5′-triphosphate (dPPTP), 2′-deoxy-allenyl-7-azaindole-5′-triphosphate (dA7AITP), and 2′-deoxy-propynyl-7-azaindole-5′-triphosphate (dP7AITP).
 8. The method of claim 1, wherein the at least one specific terminator comprises an A terminator comprising a first label, a T terminator comprising a second label, a G terminator comprising a third label, and a C terminator comprising a fourth label, wherein the first, second, third, and fourth labels are detectably different.
 9. The method of claim 1, wherein the MDAT comprises at least one of electrophoresis, chromatography, HPLC, mass spectroscopy, sedimentation, field-flow fractionation, or multi-stage fractionation.
 10. The method of claim 1, wherein the MDAT comprises electrophoresis.
 11. The method of claim 1, wherein the label comprises a fluorescent dye.
 12. The method of claim 11, wherein the label comprises an energy-transfer fluorescent dye.
 13. A method of sequencing at least one target nucleic acid template comprising: (a) forming at least one reaction composition comprising at least one target nucleic acid template, at least one primer comprising a label, at least one polymerase, at least one universal nucleotide, and at least one specific terminator; (b) incubating the reaction composition under appropriate conditions to generate at least one primer extension product comprising one or more of the at least one universal nucleotides and one or more of the at least one specific terminators; (c) separating one or more of the at least one primer extension products, wherein the separating comprises at least one mobility-dependent analysis technique (MDAT); and (d) detecting one or more of the at least one primer extension products.
 14. The method of claim 13, wherein the reaction composition further comprises at least one specific extendable nucleotide.
 15. The method of claim 13, wherein the reaction composition does not include a specific extendable nucleotide.
 16. The method of claim 13, wherein the at least one universal nucleotide is one type of universal nucleotide.
 17. The method of claim 13, wherein the at least one universal nucleotide are two types of universal nucleotides.
 18. The method of claim 13, wherein the at least one universal nucleotide are three types of universal nucleotides.
 19. The method of claim 13, wherein one or more of the at least one universal nucleotides is individually selected from d7AITP, dICSTP, dPICSTP, dM7AITP, dImPyTP, dPPTP, dA7AITP, and dP7AITP.
 20. The method of claim 13, wherein the label comprises a fluorescent dye.
 21. The method of claim 20, wherein the label comprises an energy-transfer fluorescent dye.
 22. The method of claim 13, wherein the at least one reaction composition comprises a first reaction composition comprising a primer comprising a first label and a first specific terminator, a second reaction composition comprising a primer comprising a second label and a second specific terminator, a third reaction composition comprising a primer comprising a third label and a third specific terminator, and a fourth reaction composition comprising a primer comprising a fourth label and a fourth specific terminator, wherein the first, second, third, and fourth labels are different, and wherein the first, second, third, and fourth terminators are different.
 23. The method of claim 13, wherein the MDAT comprises at least one of electrophoresis, chromatography, HPLC, mass spectroscopy, sedimentation, field-flow fractionation, or multi-stage fractionation.
 24. The method of claim 13, wherein the MDAT comprises electrophoresis.
 25. A method of detecting a plurality of primer extension products comprising: (a) forming a reaction composition comprising at least one target nucleic acid template, at least one primer comprising a label, at least one polymerase, and at least one universal nucleotide; (b) incubating the reaction composition under appropriate conditions to generate at least one primer extension product comprising one or more of the at least one universal nucleotides; (c) releasing one or more of the at least one primer extension products from one or more of the at least one target nucleic acid templates; (d) repeating (b) and (c) to generate a plurality of primer extension products; (e) separating one or more of the plurality of primer extension products, wherein the separating comprises at least one mobility-dependent analysis technique (MDAT); and (f) detecting one or more of the plurality of primer extension products.
 26. The method of claim 25, wherein one or more of the at least one target nucleic acid templates has been cut with at least one restriction endonuclease.
 27. The method of claim 25, wherein one or more of the at least one target nucleic acid templates comprises a short tandem repeat.
 28. The method of claim 25, wherein the reaction composition further comprises at least one specific extendable nucleotide.
 29. The method of claim 25, wherein the reaction composition does not include a specific extendable nucleotide.
 30. The method of claim 25, wherein the at least one universal nucleotide is one type of universal nucleotide.
 31. The method of claim 25, wherein the at least one universal nucleotide are two types of universal nucleotides.
 32. The method of claim 25, wherein the at least one universal nucleotide are three types of universal nucleotides.
 33. The method of claim 25, wherein one or more of the at least one universal nucleotides is individually selected from d7AITP, dICSTP, dPICSTP, dM7AITP, dImPyTP, dPPTP, dA7AITP and dP7AITP.
 34. The method of claim 25, wherein the label comprises a fluorescent dye.
 35. The method of claim 34, wherein the label comprises an energy-transfer fluorescent dye.
 36. The method of claim 25, wherein the at least one target nucleic acid template comprises a first target nucleic acid template and a second target nucleic acid template, wherein the first and second target nucleic acid templates are different, and wherein the at least one primer comprises (i) a first primer that is specific for the first target nucleic acid template and that comprises a first label and (ii) a second primer that is specific for the second target nucleic acid template and that comprises a second label, and wherein the first and second labels are detectably different.
 37. A method of detecting a plurality of second primer extension products comprising: (a) forming a first reaction composition comprising at least one target nucleic acid template, at least one first primer, at least one specific nucleotide, and at least one first polymerase, wherein the first reaction composition does not include a universal nucleotide; (b) incubating the reaction composition under appropriate conditions to generate at least one first primer extension product; (c) releasing the one or more of at least one first primer extension products from one or more of the at least one target nucleic acid templates; (d) repeating (b) and (c) to generate a plurality of first primer extension products; (e) forming a second reaction composition comprising one or more of the plurality of first primer extension products, at least one second primer comprising a label, at least one second polymerase, and at least one universal nucleotide; (f) incubating the reaction composition under appropriate conditions to generate at least one second primer extension product; (g) releasing one or more of the at least one second primer extension products from one or more of the at least one first primer extension products; (h) repeating (f) and (g) to generate a plurality of second primer extension products; (i) separating one or more of the plurality of second primer extension products, wherein the separating comprises at least one mobility-dependent analysis technique (MDAT); and (j) detecting one or more of the plurality of second primer extension products.
 38. The method of claim 37, wherein one or more of the at least one target nucleic acid templates comprises a short tandem repeat.
 39. The method of claim 37, wherein the second reaction composition further comprises at least one specific extendable nucleotide.
 40. The method of claim 37, wherein the second reaction composition does not include a specific extendable nucleotide.
 41. The method of claim 37, wherein the at least one universal nucleotide is one type of universal nucleotide.
 42. The method of claim 37, wherein the at least one universal nucleotide are two types of universal nucleotides.
 43. The method of claim 37, wherein the at least one universal nucleotide are three types of universal nucleotides.
 44. The method of claim 37, wherein one or more of the at least one universal nucleotides is individually selected from d7AITP, dICSTP, dPICSTP, dM7AITP, dImPyTP, dPPTP, dA7AITP, and dP7AITP.
 45. The method of claim 37, wherein the label comprises a fluorescent dye.
 46. The method of claim 45, wherein the label comprises an energy-transfer fluorescent dye.
 47. The method of claim 37, further comprising removing the target nucleic acid template prior to forming the second reaction composition.
 48. The method of claim 1, wherein (b) further comprises releasing one or more of the at least one primer extension products from one or more of the at least one target nucleic acid templates, and said incubating and releasing are repeated at least one additional time.